19.  ABSTRACT  (Continued)

the respiratory sound data (MPF, FPK, FMAX) and the mean flow-rate data are
the variables used in the cluster analysis. MPF is the mean frequency of
the power spectrum, FPK is the frequency of the peak power, and FMAX is the
highest frequency at which the power in the spectrum equals or is less than
10% of the maximum power.

Results indicate that clustering the data on the indices MPF and FMAX
appears to separate the data into 3 groups. Other combinations of indices
vary: some appear to show 3 clusters, others appear to show 4, and some
reveal no distinct clusters at all. In summary, from the cluster plots, we
conclude that the data may be clustered into 3 or 4 groups.

CLUSTER  ANALYSIS  OF  RESPIRATORY  SOUNDS  OF  PULMONARY
INSUFFICIENT  PATIENTS  AND  NORMAL  SUBJECTS 


INTRODUCTION 


Respiration  is  one  of  the  physiological  functions  of  concern  when  a 
patient  is  under  examination  or  treatment.  A  clinical  relationship  between 
respiratory  sounds  and  gross  respiratory  pathology  was  established  in  the 
nineteenth century. Auscultation of respiratory sounds, however, is very
subjective. Because of this subjectivity, respiratory sounds are accepted
as a clinical sign to varying degrees. To alleviate the problem, various
researchers have studied respiratory sounds to explore and develop
automated methods for analysis and diagnosis of pulmonary diseases.

Our objective is to determine whether respiratory sound data from normal
volunteers and subjects with pulmonary insufficiency reveal groupings or
clusters. Patient data were obtained from the USAF School of Aerospace
Medicine, Brooks Air Force Base, Texas. The study is a "blind" study
because the classification (normal/pulmonary insufficiency) of the
subjects was not revealed.


Background 

The  process  of  respiration  is  of  vital  importance  to  life  and  includes 
the  following  mechanistic  events:  (1)  pulmonary  ventilation,  the  inflow  and 
outflow  of  air  between  the  atmosphere  and  the  lung  alveoli,  (2)  diffusion 
of  oxygen  and  carbon  dioxide  between  the  alveoli  and  the  blood,  and  (3) 
transportation  of  oxygen  in  the  blood,  principally  in  combination  with 
hemoglobin,  to  the  tissue  capillaries  where  it  is  released  for  use  by  the 
cells  according  to  their  metabolic  needs  [19]. 

The  foundations  of  respiratory  medicine  were  laid  at  the  beginning  of 
the  nineteenth  century  when  Laennec  established  the  clinical  relationship 
between  respiratory  sound  and  gross  pulmonary  pathology  by  the  use  of  the 
early  stethoscope  [22].  Auscultation  in  respiratory  medicine,  however,  has 
advanced  slowly  since  Laennec  established  auscultation  of  lung  sounds  as  a 
means  of  diagnosing  the  condition  of  the  lungs. 

The  slow  progress  is  due  to:  (1)  a  variety  of  human  factor  problems, 
(2)  the  limitations  of  the  instrumentation,  (3)  the  lack  of  total 
understanding  of  the  mechanism  of  production  of  respiratory  sounds,  and  (4) 
the  lack  of  understanding  of  the  origin  of  the  source  of  the  sounds. 


Human  Factor  Problems 

Respiratory sounds heard through a stethoscope can be roughly classified
into two major types. The first major type is the normal respiratory
sounds. These are the inspiratory and expiratory sounds heard as the air
moves in and out of the chest during normal breathing. There are two types
of normal respiratory sounds. Tracheal or bronchial respiratory sounds are
heard by placing the stethoscope over the trachea and listening as the
patient breathes in and out with the mouth open. The sound is described as
"tubular" and is similar to the sound that arises when air is blown through
a tube. The second type of normal respiratory sound is called vesicular.
The term "vesicular" derives from the Latin for little vessels. It
describes the sound heard over most of the chest of normal persons during
normal breathing. The analogy used is the rustle of wind in the trees [11].

The term "adventitious" is used to describe sounds not expected in the
normal chest. To complicate matters, the terminology of adventitious
sounds is not standard. The clinician attempts to describe the quality of
sounds with adjectives that convey an idea of relative intensity and pitch,
and different clinicians use the same term to describe dissimilar sounds
[3,11].

Adventitious sounds are divided into those believed to have a
bronchopulmonary origin and those thought to be due to pleural disease.
Those of bronchopulmonary origin are further subdivided into "continuous"
and "discontinuous". Sounds that last for more than a tiny fraction of the
respiratory cycle are referred to as continuous. Rales, the discontinuous
sounds, are further subdivided into fine, medium, and coarse. A variety of
adjectives appear in the medical literature to classify these sounds
further, including dry, wet, moist, bubbling, crepitant, subcrepitant, and
consonating. The terminology is subjective in its interpretation,
depending greatly on the hearing and experience of the clinician
[3,5,14,15,27-29].


Another  problem  in  chest  auscultation  is  that  so  much  information 
exists  that  it  is  difficult  either  to  record  it  properly  or  to  remember  the 
observed  details.  By  listening  over  a  single  site  in  the  chest,  it  is 
possible  to  observe  the  intensity  of  both  inspiration  and  expiration.  It 
is  also  possible  to  grade  these  on  a  scale  that  may  reflect  normality  or 
abnormality  of  the  site.  The  clinician  may  also  be  able  to  record  the 
presence  or  absence  of  various  adventitious  sounds  and  their  relationship 
to  the  respiratory  cycle.  The  duration  of  inspiration  with  respect  to 
expiration  may  also  be  noted.  If  the  clinician  listens  to  one  or  more 
sites,  as  is  common  in  routine  chest  auscultation,  then  there  is  a 
possibility  that  information  may  be  lost  due  to  various  factors  such  as 
interruptions  in  the  physical  examination,  lack  of  accurate  record  keeping, 
or  the  clarity  of  the  clinician's  initial  observations  [27]. 

The  diversity  of  the  terms  used  to  describe  respiratory  sounds, 
together  with  the  nonuniformity  of  their  usage  by  clinicians,  poses 
difficulties  in  the  use  of  respiratory  sounds  as  a  precise  indicator  of  the 
condition  of  the  respiratory  system. 


Limitations  of  Instrumentation


The  binaural  stethoscope  appeared  towards  the  middle  of  the  nineteenth 
century  and  became  popular  mainly  because  it  excludes  extraneous  noise 
[17,26].  The  choice  of  the  best  chest  piece  remained  controversial  until 
physicians  agreed  that  both  the  diaphragm  and  the  bell  were  necessary  for 
auscultation  of  the  heart;  they  are  combined  in  most  stethoscopes  now  in 
use. 

The  stethoscope  transmits  the  range  of  frequencies  which  includes 
frequencies  of  heart  and  lung  sounds.  By  varying  the  pressure  between  the 
chest  piece  and  the  skin,  the  intensities  of  certain  frequencies  are 
increased  while  others  are  decreased.  Low  pitched  heart  sounds  are  heard 
best  with  the  bell  resting  lightly  on  the  skin,  while  firm  pressure  of  the 
bell  or  diaphragm  increases  the  intensity  of  higher  frequencies  and 
suppresses  unwanted  low  pitched  sounds. 

Respiratory sounds contain a wide range of frequencies. To compare
relative frequency intensities within a particular sound spectrum, the
measuring instrument should not itself contribute variations in intensity.
The conventional stethoscope does not meet this requirement [13,27].

Sound evaluation is further complicated by the nonlinearity of the human
auditory system. The ear is capable of distinguishing small differences in
pitch. As the intensity of the sound increases, the sensitivity of the ear
to intensity variations decreases logarithmically. The ear's perception of
intensity falls off at both ends of the frequency spectrum [13]. The
audible frequency range also changes with age: it extends from 30 to
20,000 cycles per second (cps) for a young person but only from 50 to
8,000 cps in old age [18]. The ear is also unable to distinguish short
sound bursts. A burst shorter than 3 ms will be heard only as a click
regardless of its frequency [15].

Electronic  instruments  can  be  designed  to  exhibit  a  flat  frequency 
response  over  the  range  of  respiratory  sound  spectrum  and  thus  overcome 
some  of  the  instrumentation  shortcomings.  The  major  obstacles,  however,  in 
the  acceptance  and  advancement  of  using  respiratory  sounds  as  a  major 
clinical  tool  in  pulmonary  medicine  are  the  lack  of  complete  understanding 
of:  (1)  the  mechanism  of  production  of  respiratory  sounds,  and  (2)  the 
sources  from  which  respiratory  sounds  are  generated. 


Mechanisms  and  Source  of  Respiratory  Sounds 

Numerous  theories  exist  on  the  source  of  respiratory  sounds  and  the 
mechanisms  by  which  they  are  produced.  In  1884,  Bullar  [2]  performed  an 
experiment  with  an  exteriorized  lung  of  a  calf.  The  left  lung  was  enclosed 
in  an  air-tight,  fluid-filled  chamber  with  glass  sides,  and  the  right  lung 
remained  outside  the  thorax  in  a  collapsed  state.  The  pressure  surrounding 
the  left  lung  was  lowered,  thus  simulating  inspiration  in  the  left  lung. 
During  the  simulated  inspiration,  a  bronchial  breathing  sound  was  heard 
over the right lung. Bullar then plugged the trachea and forced air out of
the left lung into the right lung and noted that a vesicular inspiratory
murmur was heard over the right lung. He also showed that without air flow
no sounds were heard. Bullar's conclusion was that air currents passing
over the main bronchus of the outside lung generated bronchial sounds and
that air passing from narrower to wider passages inside the lung generated
expiratory sounds.

Bushnell  [4]  claimed  that  the  sounds  of  expiration  originated  in  the 
larynx,  and  that  the  sounds  of  inspiration  originated  partly  in  the  larynx 
and  partly  in  the  alveoli. 

Martini  and  Muller  [24]  believed  that  the  bronchial  network  of  the 
lungs  was  responsible  for  the  generation  of  respiratory  sounds.  They 
showed  that  each  generation  of  bronchus  up  to  the  generation  with  inner 
caliber  of  3  mm  had  its  own  specific  frequency  of  vibration.  During 
respiration  the  bronchi  would  vibrate  at  their  specific  frequency.  These 
vibrations  acted  on  the  lung  tissues  and  on  the  chest  wall. 

Forgacs et al. [14] performed an experiment on asthmatic and chronic
bronchitic patients. Measurements were taken while the patients breathed
air and while they breathed a mixture of 79% helium and 21% oxygen. They
found that the respiratory sound intensity of the patients was lower while
breathing the mixture than while breathing air. The respiratory sounds
were silenced by the helium mixture because flow is less turbulent in
gases of low density.

At  very  low  inspiratory  flow  rates  the  flow  of  air  in  the  trachea  and 
bronchi  is  laminar  and  presumably  silent.  At  higher  flow  rates  turbulence 
sets  in  and  some  of  the  kinetic  energy  of  flow  is  then  converted  into  heat 
and  sound.  In  engineering  problems  the  transition  from  laminar  to 
turbulent  flow  is  predictable  by  the  Reynolds  number,  calculated  from  the 
density  and  viscosity  of  the  gas,  the  dimensions  of  the  pipe,  and  the  flow 
velocity.  This  prediction,  however,  is  less  reliable  when  applied  to  a 
complicated  system  of  branching  pipes  like  the  bronchial  tree.  Forgacs  et 
al.  [14]  concluded  that  respiratory  sounds  were  generated  in  the  turbulent 
zone  of  the  bronchial  tree. 
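
As a rough illustration of the Reynolds number argument, the sketch below evaluates Re = (density x velocity x diameter) / viscosity for air and for a helium-oxygen mixture. The tracheal diameter, flow velocity, and gas properties are assumed round values chosen only for illustration; none of them come from the studies cited here.

```python
def reynolds_number(density, velocity, diameter, viscosity):
    """Re = rho * v * D / mu for flow in a circular pipe (SI units)."""
    return density * velocity * diameter / viscosity


# Assumed geometry and flow (illustrative, not measured values).
TRACHEA_DIAMETER_M = 0.018      # assumed adult tracheal diameter
FLOW_VELOCITY_M_S = 2.0         # assumed mean gas velocity in the trachea

# Assumed approximate gas properties near room temperature.
AIR = {"density": 1.2, "viscosity": 1.8e-5}             # kg/m^3, Pa*s
HELIUM_OXYGEN = {"density": 0.41, "viscosity": 2.0e-5}  # about 79% He, 21% O2

re_air = reynolds_number(AIR["density"], FLOW_VELOCITY_M_S,
                         TRACHEA_DIAMETER_M, AIR["viscosity"])
re_heo2 = reynolds_number(HELIUM_OXYGEN["density"], FLOW_VELOCITY_M_S,
                          TRACHEA_DIAMETER_M, HELIUM_OXYGEN["viscosity"])

print(f"Re (air)           = {re_air:.0f}")
print(f"Re (helium-oxygen) = {re_heo2:.0f}")
```

Under these assumptions the lower density of the helium mixture gives a Reynolds number roughly one third of that for air at the same velocity, which is consistent with the less turbulent flow described above.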

Calculations  based  on  the  Reynolds  number  and  experimental  observations 
of  gas  flow  in  models  of  the  bronchi  suggest  that  flow  is  turbulent  in  the 
trachea  and  the  first  few  generations  of  the  bronchi,  and  laminar  in  the 
peripheral  bronchi  where  the  Reynolds  number  is  less  than  one.  Between 
these  two  regions  there  lies  an  intermediate  zone,  extending  from  the 
segmental  bronchi  to  the  fifteenth  generation  of  the  airways,  where  the 
laminar  flow  pattern  is  disrupted  by  vortices  [8,16,21,33,34]. 

Hardin  and  Patterson  [20]  claimed  that  the  production  of  respiratory 
sounds  was  primarily  by  vortices  and  that  turbulence  played  only  a  minor 
role.  When  a  stream  of  gas  emerging  from  a  slit  or  a  circular  orifice
enters  a  wider  channel,  a  shearing  force  arises  at  the  boundary  between  the 
jet  and  the  surrounding  fluid.  The  circular  motion  set  up  by  this  force 
generates  whirlpools  or  vortices,  shed  alternately  from  opposite  sides  of 
the  jet.  Similar  vortices  are  produced  where  a  curvature  or  angle  in  the
pipe  forces  flow  to  change  directions  abruptly.  The  gas  stream  separates 
into  layers  that  move  forward  at  different  velocities.  The  slower  streams 
are  turned  into  circular  motion  by  the  shearing  force  of  the  high  velocity 
streamlines  flowing  alongside.  This  motion  creates  vortices,  usually  in 
pairs,  whirling  in  opposite  directions.  Like  turbulence,  vortex  formation 
begins  when  the  Reynolds  number  reaches  a  critical  value.  Above  this 
number  the  rate  of  formation  of  vortices  depends  on  flow  velocity  alone. 
The  distance  between  vortices  carried  downstream  by  the  flow  of  gas  varies 
at  random,  so  that  the  resulting  sound  is  a  noise  with  a  wide  frequency 
spectrum  [16].

The source of the "vesicular" (normal) respiratory sound has been a matter
of debate for more than 100 years. Bullar [2] attributed the production of
the inspiratory component to the alveoli. Bushnell [4] and Bates [1],
however, attributed it to the larynx. Forgacs [14] stated that the airways
between the larynx and the alveoli were the source of the inspiratory
component.

Recent  studies  [23,31,32]  indicate  that  the  inspiratory  vesicular  sound 
is  produced  within  the  lung,  near  the  area  auscultated,  although  the  exact 
site  of  production  is  not  established. 

The origin of the expiratory component of the vesicular sound is more
obscure. It is frequently reasoned that this sound is generated either at
the glottis, as a result of passage of air through partially adducted vocal
cords, or within the larger airways, as a result of convergence of air
streams [12,25,35].

Since  Laennec's  [22]  time  it  has  been  known  that  pulmonary  pathology 
can  cause  a  change  in  respiratory  sound.  Differences  exist  between  normal 
and  pathological  respiratory  sounds.  In  1976,  Grassi  et  al.  [17]  used  the 
technique  of  phonopneumography  to  analyze  respiratory  sounds. 
Phonopneumography  is  defined  as  the  technique  of  detecting  and  analyzing 
the  sounds  that  are  produced  in  the  bronchopulmonary  area  during 
respiration.  The  apparatus  Grassi  et  al.  used  was  intended  to  provide  a 
graphic  recording  of  the  level  of  the  sounds  the  clinician  perceives 
through  the  stethoscope  during  auscultation.  The  phonopneumographic 
records  revealed  that  inspiration  was  louder  than  expiration  in  the  healthy 
subjects.  Wherever  the  ratio  between  inspiratory  and  expiratory  peak
amplitudes  was  found  to  be  highly  modified,  either  because  inspiration  was 
much  louder  than  normal,  or  because  an  inversion  of  the  ratio  occurred  with 
an  expiratory  sound  louder  than  inspiratory,  the  finding  was  always 
accompanied  by  pathological  alteration  of  the  pulmonary  zone. 

Chowdhury  and  Majumder  [7]  conducted  an  experiment  using  digital 
spectral  analysis  of  respiratory  sounds  for  the  purpose  of  determining  the 
clinical  relationship  between  the  frequency  spectrum  and  the  conditions  of 
the  lungs  in  pulmonary  diagnosis.  Their  study  included  6  normal  subjects 
and  6  tuberculosis  patients  with  fibrosis.  The  recordings  of  respiratory 
sounds  were  made  in  a  quiet  room  with  the  subject  in  the  supine  position 
and the microphone placed on the right lung base. The Fast Fourier
Transform algorithm of Cooley and Tukey was used to obtain the normalized
autospectrum of 0.25 s time segments of respiratory sound. Their analysis
indicates a maximum amplitude at about 250 Hz for subjects without
pathological lung history, with a rapid decrease in amplitude as the
frequency increases and approaches 1000 Hz. In the case of the tubercular
lung, a downward frequency shift of the amplitude peak and the presence of
higher frequency components were observed.
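
The idea of a normalized spectrum for one 0.25 s segment can be sketched as follows. This is a minimal illustration, not the procedure of Chowdhury and Majumder: it uses a direct discrete Fourier transform from the Python standard library in place of an FFT routine, and the sampling rate, the test tone, and the function name `power_spectrum` are assumptions made for the example.

```python
import cmath
import math


def power_spectrum(samples, sample_rate):
    """Return (frequencies, power normalized to its peak) using a direct DFT."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]          # remove any DC offset
    # Precompute e^{-j 2 pi m / n} once; the DFT only needs these n values.
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    freqs, power = [], []
    for k in range(n // 2 + 1):                     # non-negative frequencies
        coeff = sum(x * twiddle[(k * i) % n] for i, x in enumerate(centered))
        freqs.append(k * sample_rate / n)
        power.append(abs(coeff) ** 2)
    peak = max(power) or 1.0                        # avoid dividing by zero
    return freqs, [p / peak for p in power]


SAMPLE_RATE = 4096                  # assumed samples/second
SEGMENT_SECONDS = 0.25              # segment length quoted in the text
n_samples = int(SAMPLE_RATE * SEGMENT_SECONDS)

# Synthetic test tone (assumed) standing in for a recorded breath sound.
segment = [math.sin(2 * math.pi * 248 * i / SAMPLE_RATE)
           for i in range(n_samples)]

freqs, spectrum = power_spectrum(segment, SAMPLE_RATE)
peak_frequency = freqs[spectrum.index(max(spectrum))]
print(f"frequency resolution: {freqs[1]} Hz, peak at {peak_frequency} Hz")
```

A 0.25 s segment gives a frequency resolution of 4 Hz regardless of the sampling rate, which bounds how finely a spectral peak near 250 Hz can be located.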

Charbonneau et al. [6] developed an index which they used to discriminate
between asthmatics and normal subjects. They calculated the average
spectrum for inspiration and for expiration and referred to each as a
histogram. Four parameters were calculated for both the expiratory and
inspiratory histograms, and the sum of the parameters was used as the
index. The parameters were: the bandwidth (taken at half of the peak
amplitude), the integral over the range 60-1260 Hz, the highest significant
frequency (taken where the amplitude falls to 10% of the peak amplitude),
and the weighted mean frequency of each mean spectrum. Their study
included 11 normal and 10 asthmatic subjects.
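
The four spectral parameters described above can be sketched as simple operations on a sampled spectrum. The definitions below are one plausible reading of the text, not the published algorithm: the function name, the single-lobe assumption for the bandwidth, the trapezoidal integration, and the synthetic triangular spectrum are all assumptions made for illustration.

```python
def spectral_parameters(freqs, amplitude, band=(60.0, 1260.0)):
    """Compute four summary parameters of a spectrum (freqs ascending, in Hz)."""
    peak = max(amplitude)
    pairs = list(zip(freqs, amplitude))

    # Bandwidth at half peak amplitude (assumes a single dominant lobe).
    above_half = [f for f, a in pairs if a >= 0.5 * peak]
    bandwidth = above_half[-1] - above_half[0]

    # Trapezoidal integral of the amplitude over the chosen band.
    in_band = [(f, a) for f, a in pairs if band[0] <= f <= band[1]]
    integral = sum(0.5 * (a0 + a1) * (f1 - f0)
                   for (f0, a0), (f1, a1) in zip(in_band, in_band[1:]))

    # Highest frequency whose amplitude is still at least 10% of the peak.
    highest_significant = max(f for f, a in pairs if a >= 0.1 * peak)

    # Amplitude-weighted mean frequency.
    mean_frequency = sum(f * a for f, a in pairs) / sum(amplitude)

    return {
        "bandwidth": bandwidth,
        "integral": integral,
        "highest_significant": highest_significant,
        "mean_frequency": mean_frequency,
    }


# Synthetic triangular spectrum peaking at 250 Hz (assumed test data).
freqs = [float(f) for f in range(0, 2001, 10)]
amplitude = [max(0.0, 1.0 - abs(f - 250.0) / 250.0) for f in freqs]

params = spectral_parameters(freqs, amplitude)
for name, value in params.items():
    print(f"{name}: {value:.1f}")
```

How the four values would be scaled before being summed into a single index is not stated in the text, so the sketch returns them separately.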


Problems  with  Sound  Intensity  Recorded  from  Chest  Wall 

The  correlation  of  respiratory  sound  intensity  and  the  distribution  of 
pulmonary  ventilation  was  first  studied  by  Leblanc  et  al.  [23].  From  their
study  they  concluded  that  the  intensity  of  respiratory  sounds  varied  with 
lung  volume,  flow  rate,  body  orientation,  and  the  site  of  the  recording. 
O’Donnell  and  Kraman  [30]  and  Dosani  and  Kraman  [9]  conducted  studies  to 
investigate  the  intensity  patterns  of  lung  sound  on  the  chest  wall.  They 
concluded  that  there  was  a  considerable  intersubject  and  intrasubject 
variability  in  amplitude  of  the  inspiratory  vesicular  sound  heard  on  the 
chest  wall,  and  that  the  variability  was  due  to  factors  other  than  the 
distribution  of  ventilation  and  chest  wall  thickness.  These  variations 
happen  even  with  normal  subjects  without  any  diseases  of  the  lung.  They 
believed  that  the  other  factors  included  the  site  of  production  of  these 
sounds  and  their  transmission  through  the  airways  and  lung  tissue.  Dosani 
and  Kraman  [9]  pointed  out  that  the  chest  wall  thickness  may  not  have  a 
predominant  effect  on  the  intensity.  Their  results  showed  that  sound 
intensity  at  the  lateral  wall  was  similar  to  sound  at  positions  near  the 
spine  where  the  thickness  of  the  chest  wall  is  greater. 


Rationale  for  Trachea  as  Site  of  Respiratory  Sound  Detection 

The variability of the acoustic properties of the chest wall accounts for
the variability of sound intensity as measured at the chest wall
[6,10,18,31]. Previous research indicates that respiratory sounds measured
at the trachea undergo very little filtering [6,10,18]. Charbonneau [6]
stated that the sound level is higher at the trachea than at any other
point of the chest or back and that the localization of the point is more
precise. Therefore, the respiratory sound recordings in this study were
obtained from the area of the trachea.


EXPERIMENTAL  PROCEDURES 


Air  Force  Experimental  Procedures 

Data were obtained from the USAF School of Aerospace Medicine under
USAF contract P36615-83-D-0602, Academic Research in Biotechnology, Task
001, sponsored by the USAF School of Aerospace Medicine, Brooks AFB,
Texas. The data were collected by USAF personnel at Wilford Hall USAF
Medical Center on patients with pulmonary insufficiency and on normal
volunteers. The experimental procedure used by the U.S. Air Force for data
collection was as follows:

The patient was instrumented with a pulmonary flowmeter consisting of two
Fleisch pneumotachometers with a Rudolph valve between them, connected to
a mouthpiece. One pneumotachometer transduced the inspiratory flow rate
and the other transduced the expiratory flow rate. The pneumotachometers
had pressure taps that were connected to Validyne pressure transducers. An
electronic stethoscope was held at the anterior cervical triangle for the
detection of respiratory sounds. The patient breathed exclusively through
the flowmeter; a nose clip prevented nose breathing. A minimum of 5 min of
recording time was collected for each patient to allow the patient to
become accustomed to the apparatus and establish a normal breathing
pattern. The respiratory sounds and the flow rate were transduced and
recorded on an FM analog tape recorder.


Experimental  Procedure  for  Data  Analysis 

The magnetic tapes were provided under contract to the Bioengineering
Department of Texas A&M University for data analysis. Since the
classification (normal/pulmonary insufficiency) of the subjects, the
number of subjects, and the location of each subject in relation to the
time code were not revealed by the U.S. Air Force, the study was a "blind
study".

The  equipment  used  in  the  analysis  of  the  magnetic  tapes  consisted  of 
the  following  components: 

1.  Ampex  2200  FM  analog  tape  recorder

2.  Datum  Time  Code  Generator/Reader  Model  9300

3.  20-Hz,  1800-Hz  low-pass  filters 

4.  100-Hz  high-pass  filter 

5.  A/D  Multiprogrammer  (HP  6942A)

6.  Tektronix  5A22N  Differential  Amplifier 

7.  Tektronix  5111A  Storage  Oscilloscope

8.  Tektronix  2236  100-MHz  Storage  Oscilloscope 

A  block  diagram  which  illustrates  the  logical  arrangement  of  the  equipment 
used  for  analysis  of  inspiratory  and  expiratory  data  is  shown  in  Figure  1. 


Inspiratory  Data  Analysis 


As  shown  in  Figure  1,  the  respiratory  sound  signal  and  the  inspiratory 
flow-rate  signal  were  monitored  simultaneously  on  an  oscilloscope.  The 
magnetic  tape  speed  was  3-3/4  in./s.  Recordings  were  made  with  standard 
IRIG  intermediate  band  record/reproduce  amplifiers  with  a  cutoff  frequency 
(3  dB)  at  19  kHz  and  a  signal-to-noise  ratio  of  35  dB.  Channel  10  of  the
FM  recorder  was  connected  to  the  time  decoder  to  display  the  recorded  time. 
The  time  decoder  was  monitored  until  a  valid  time  code  was  displayed, 
indicating  a  subject's  data  were  recorded  on  Channel  2  (inspiratory  flow 
data)  and  on  Channel  4  (respiratory  sound  data)  of  the  magnetic  tape. 
Then,  the  time  code  and  signals  were  simultaneously  monitored  and  the 
total  length  of  the  subject's  inspiratory  data  was  recorded.  The  total 
length  of  time  was  assumed  to  be  the  time  between  the  beginning  of  a  time 
code  and  the  time  when  the  time  code  cleared.  Then,  the  time  of  each 
inspiratory  breath  was  also  recorded.  A  calibration  procedure  was 
performed  on  the  inspiratory  flow-rate  signal  to  calculate  the  gain  that 
was  used.  For  a  specific  inspiratory  breath  the  signal  on  the  magnetic 
tape  was  viewed  on  an  oscilloscope  and  the  voltage  of  the  signal  was 
recorded.  The  signal  from  the  output  of  the  20-Hz  filter  for  the  same 
inspiratory  breath  was  viewed  on  another  oscilloscope  simultaneously  and 
the  voltage  of  the  signal  was  recorded.  The  gain  was  calculated  by  the 
ratio  of  the  voltage  recorded  from  the  output  of  the  20-Hz  filter  to  the 
voltage  recorded  directly  off  the  magnetic  tape.  The  value  of  the  gain  was 
necessary  in  determining  the  correct  inspiratory  flow  rate.  This  procedure 
was  done  prior  to  inspiratory  data  collection  for  each  new  subject. 
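
The gain calculation described above amounts to a single ratio. The sketch below shows it together with the step of referring digitized samples back to the tape-level voltage. The function names and the example voltages are hypothetical, and the further conversion from volts to flow rate depends on a pneumotachometer calibration factor that is not given here.

```python
def filter_gain(filter_output_volts, tape_volts):
    """Gain of the filter stage: filtered voltage divided by raw tape voltage."""
    if tape_volts == 0:
        raise ValueError("tape voltage must be nonzero")
    return filter_output_volts / tape_volts


def refer_to_tape(digitized_volts, gain):
    """Divide out the filter gain so samples are expressed at tape level."""
    return [v / gain for v in digitized_volts]


# Hypothetical readings taken from the two oscilloscopes for one breath.
gain = filter_gain(filter_output_volts=2.5, tape_volts=1.25)

# Hypothetical digitized flow samples (volts at the filter output).
tape_level = refer_to_tape([1.0, 2.0, 3.0], gain)
print(gain, tape_level)
```

Because the gain is measured again for each new subject, any change in filter or amplifier settings between subjects is absorbed into this one factor.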

The  inspiratory  flow-rate  signal  was  taken  from  Channel  2  of  the  FM 
recorder  and  fed  into  the  20-Hz  low-pass  filter.  The  output  of  the  filter 
was  fed  into  the  oscilloscope  for  monitoring  and  the  analog  input  box  for 
manual  triggering.  The  output  of  the  analog  input  box  was  then  fed  into 
Channel  2  of  the  analog-to-digital  (A/D)  converter.

The inspiratory sound signal was taken from Channel 4 of the FM recorder
and fed into a 100-Hz high-pass filter followed by an 1800-Hz low-pass
filter. The output of the filter was fed into the oscilloscope for
monitoring and into Channel 1 of the A/D converter. After calibration of
the flow data and random selection of four breaths, data collection began.

The HP-9836 desktop microcomputer was programmed to monitor the
inspiratory flow rate in the A/D channel and begin data collection when the
flow rate reached a predetermined level. The level was determined by
viewing the signal on the oscilloscope. Data collection was handled by the
multiplexed A/D channel at a sampling rate of 8192 samples/second. Because
the two analog channels (respiratory sound and flow rate) shared this rate,
each was sampled at 4096 samples/second. The program was modified to be
interactive, which allowed for versatility in processing recorded data with
large variabilities. The program prompted the operator for changes of the
flow-rate trigger level, the duration of data collection, the number of
repetitions, the data file name, the data file structure, and the sampling
rate. The operator could also decide whether to save the collected data.
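
The relation between the multiplexed rate and the per-channel rate can be sketched as a simple de-interleaving step. The ordering of the two channels within the stream and the function name `demultiplex` are assumptions; the original HP-9836 program is not reproduced here.

```python
AGGREGATE_RATE = 8192       # samples/second across both channels (from the text)
NUM_CHANNELS = 2            # respiratory sound and flow rate


def demultiplex(interleaved, num_channels=NUM_CHANNELS):
    """Split an interleaved sample stream into one list per channel."""
    return [interleaved[ch::num_channels] for ch in range(num_channels)]


per_channel_rate = AGGREGATE_RATE // NUM_CHANNELS

# Hypothetical one-second stream: even slots are assumed to hold sound
# samples and odd slots flow samples.
stream = list(range(AGGREGATE_RATE))
sound, flow = demultiplex(stream)

print(per_channel_rate, len(sound), len(flow))
```

At 4096 samples/second per channel the Nyquist frequency is 2048 Hz, which lies above the 1800-Hz low-pass cutoff applied to the sound signal.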

The voltage level at which the A/D converter started collecting data was
displayed on the screen. If this level was within 0.2 V of the trigger
level that had been set, the data were saved and stored on the diskette.
The data file names were the respective time codes of the breaths
collected. This procedure continued until four inspiratory breaths of each
subject had been collected, digitized, and stored on diskettes for further
signal processing.
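
The trigger and acceptance logic can be sketched in a few lines. This is an illustration under stated assumptions rather than the original program: it assumes the flow voltage rises toward the trigger level, and the function names and sample values are hypothetical. Only the 0.2 V tolerance comes from the text.

```python
TRIGGER_TOLERANCE_V = 0.2   # acceptance window stated in the text


def find_trigger(flow_volts, trigger_level):
    """Index of the first sample at or above the trigger level, or None."""
    for index, value in enumerate(flow_volts):
        if value >= trigger_level:
            return index
    return None


def accept_breath(start_volts, trigger_level, tolerance=TRIGGER_TOLERANCE_V):
    """True when collection started within the tolerance of the set level."""
    return abs(start_volts - trigger_level) <= tolerance


# Hypothetical flow-rate samples (volts) at the start of a breath.
flow_volts = [0.0, 0.1, 0.4, 0.9, 1.3]
trigger_level = 0.8

start = find_trigger(flow_volts, trigger_level)
keep = accept_breath(flow_volts[start], trigger_level)
print(start, keep)
```

A breath whose first collected sample overshoots the set level by more than the tolerance would be rejected and collected again, which matches the operator's option to discard data.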


Expiratory  Data  Analysis 

The arrangement of the equipment used for analysis of expiratory data is
shown in the block diagram of Figure 1. As shown in the figure, both the
expiratory flow-rate signal and the expiratory sound signal were monitored
simultaneously on an oscilloscope. The procedure used to locate the
inspiratory data was also used to locate the expiratory data.

A  calibration  procedure  was  then  performed  on  the  expiratory  flow-rate 
signal  to  calculate  the  gain  that  was  used.  For  a  specific  expiratory 
breath,  the  signal  off  the  magnetic  tape  was  viewed  on  an  oscilloscope  and 
the  voltage  of  the  signal  was  recorded.  The  signal  from  the  output  of  the 
20-Hz  filter  for  the  same  expiratory  breath  was  viewed  on  an  oscilloscope 
and  the  voltage  of  the  signal  was  recorded.  The  gain  was  calculated  by  the 
ratio  of  the  voltage  recorded  from  the  output  of  the  20-Hz  filter  to  the 
voltage  recorded  directly  off  the  magnetic  tape.  The  value  of  the  gain  was 
necessary  in  determining  the  correct  expiratory  flow  rate.  This  procedure 
was  done  prior  to  expiratory  data  collection  for  each  new  subject. 

The expiratory flow-rate signal was taken from Channel 3 of the FM
recorder and fed into the differential amplifier of the oscilloscope. The
differential amplifier was used to balance the DC offset by adjusting the
DC step attenuation balance and the position knob. The filter setting was
DC. This step was needed because the expiratory flow transducer had a
1.4 V DC offset. The output of the oscilloscope was fed into the 20-Hz
filter, and the output of the 20-Hz filter was fed into both the
oscilloscope for monitoring and the analog input box for manual
triggering. The output of the analog input box was then fed into Channel 2
of the A/D converter. The respiratory sound signal was taken from Channel
4 of the FM recorder and fed into a 100-Hz high-pass filter followed by an
1800-Hz low-pass filter. The output of the filter was fed into the
oscilloscope for monitoring and into Channel 1 of the A/D converter.

Expiratory data collection with the HP-9836 microcomputer and A/D
converter followed the same procedure described for the inspiratory data
collection. This procedure continued until four expiratory breaths of each
subject had been collected, digitized, and stored on diskettes for further
signal processing.
QUANTITATIVE ANALYSIS


The  Fourier  series  is  used  to  represent  arbitrary  periodic  functions  by 
an  infinite  series  of  sinusoids  of  harmonically  related  frequencies  to 
study  the  time  domain  responses  in  networks.  The  Fourier  series  expressed 
as  a  linear  combination  of  harmonically  related  complex  exponentials  can  be 
written  in  the  form: 

$$x(t) = \sum_{k=-\infty}^{\infty} a_k\, e^{jk\omega_0 t} \tag{1}$$

$$a_k = \frac{1}{T_0} \int_{T_0} x(t)\, e^{-jk\omega_0 t}\, dt \tag{2}$$

where k = 0, ±1, ±2, ..., ±∞.

Equation (1) is often referred to as the synthesis equation and
equation (2) as the analysis equation. The coefficients, $a_k$, are often
called the Fourier series coefficients or the spectral coefficients of
x(t). These complex coefficients measure the portion of the signal x(t)
that is at each harmonic of the fundamental component. The fundamental
frequency is defined as $\omega_0$, and the fundamental period is
$T_0 = 2\pi/\omega_0$.
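As an illustration of the analysis equation (2), the following minimal sketch approximates the coefficients numerically for a unit square wave. The function names, the midpoint-rule integration, and the square-wave test signal are assumptions made for this example and are not part of the original study.

```python
import cmath
import math


def fourier_coefficient(x, period, k, samples=2048):
    """Approximate a_k = (1/T0) * integral over one period of x(t) e^{-j k w0 t} dt.

    Uses a midpoint-rule sum; `samples` is an illustrative choice.
    """
    w0 = 2.0 * math.pi / period          # fundamental frequency (rad/s)
    dt = period / samples
    total = 0j
    for m in range(samples):
        t = (m + 0.5) * dt               # midpoint of the m-th sub-interval
        total += x(t) * cmath.exp(-1j * k * w0 * t) * dt
    return total / period


def square_wave(t, period=1.0):
    """Hypothetical test signal: +1 for the first half period, -1 for the second."""
    return 1.0 if (t % period) < period / 2 else -1.0


a1 = fourier_coefficient(square_wave, 1.0, 1)   # analytic value: -2j/pi
a2 = fourier_coefficient(square_wave, 1.0, 2)   # analytic value: 0 (even harmonic)
```

For this signal only the odd harmonics are nonzero, which is what the coefficients are meant to reveal: how much of x(t) lies at each multiple of the fundamental frequency.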

The Fourier series method, however, has limitations in analyzing linear
systems for the following reasons: 1) The Fourier series can be used only
for inputs which are periodic; however, most inputs in practice are
nonperiodic. 2) The method applies only to systems that are stable. A
stable system is a system whose natural response decays in time.

The first limitation can be overcome since we can represent the
nonperiodic input in terms of exponential components. A method of
accomplishing this function is the Fourier Transform. For instance,
consider the nonperiodic function f(t) in Figure 2 which we would like to
represent in terms of exponential components. To do this, we construct a
periodic function $f_T(t)$ in Figure 2 with a period T, where the function
f(t) is repeated every T seconds. The period, T, is considered large
enough so there is not any overlap between pulse shapes of f(t). The new
function is a periodic function and can be represented with an exponential
Fourier series as follows:


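$$f_T(t) = \sum_{n=-\infty}^{\infty} F_n\, e^{jn\omega_0 t} \tag{3}$$

where

$$\omega_0 = \frac{2\pi}{T} \tag{4}$$

and

$$F_n = \frac{1}{T} \int_{-T/2}^{T/2} f_T(t)\, e^{-jn\omega_0 t}\, dt \tag{5}$$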

The next process is to evaluate the function as the period increases to
infinity. As T becomes infinite, the pulses repeat after an
infinite interval. Therefore, $f_T(t)$ and f(t) are identical in the limit,
and the Fourier series representing the periodic function $f_T(t)$ will also
represent f(t).

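In the limit, as T approaches infinity, $\omega_0$ approaches zero. Therefore,
$\omega_0$ can be denoted as $\delta\omega$. Then $T = 2\pi/\omega_0 = 2\pi/\delta\omega$ and,

$$T F_n = \int_{-T/2}^{T/2} f_T(t)\, e^{-jn\,\delta\omega\, t}\, dt \tag{6}$$

$TF_n$ is a function of $jn\,\delta\omega$, so let $TF_n = F(jn\,\delta\omega)$. Equation (3) becomes:

$$f_T(t) = \sum_{n=-\infty}^{\infty} F_n\, e^{jn\omega_0 t} \tag{7}$$

$$= \sum_{n=-\infty}^{\infty} \frac{F(jn\,\delta\omega)}{T}\, e^{(jn\,\delta\omega)t} \tag{8}$$

$$= \sum_{n=-\infty}^{\infty} \left\{ \frac{F(jn\,\delta\omega)}{2\pi}\, \delta\omega \right\} e^{(jn\,\delta\omega)t} \tag{9}$$

In the limit, as T approaches infinity, $\delta\omega$ approaches zero, and $f_T(t)$
approaches f(t). Then:

$$f(t) = \lim_{T\to\infty} f_T(t) = \lim_{\delta\omega\to 0} \frac{1}{2\pi} \sum_{n=-\infty}^{\infty} F(jn\,\delta\omega)\, e^{jn\,\delta\omega\, t}\, \delta\omega \tag{10}$$

which is by definition:

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(j\omega)\, e^{j\omega t}\, d\omega \tag{11}$$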

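Equation (11) is known as the inverse Fourier transform. Recall
equation (5):

$$F_n = \frac{1}{T} \int_{-T/2}^{T/2} f_T(t)\, e^{-jn\omega_0 t}\, dt \tag{12}$$

$$= \frac{F(jn\,\delta\omega)}{T} \tag{13}$$

then

$$F(j\omega) = \lim_{T\to\infty} \int_{-T/2}^{T/2} f_T(t)\, e^{-jn\,\delta\omega\, t}\, dt \tag{14}$$

$$F(j\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt \tag{15}$$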

Equation (15) is known as the direct Fourier transform.

The result,

$$F(j\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt \tag{16}$$

is the representation of the nonperiodic function f(t) in terms of
exponential functions. The amplitude of the component of any frequency $\omega$
is proportional to $F(j\omega)$. In analogy with the terminology used for the
Fourier series coefficient of a periodic signal, the transform $F(j\omega)$ of an
aperiodic signal f(t) is commonly referred to as the spectrum of f(t), as
it provides us with the information concerning how f(t) is composed of
sinusoidal signals at different frequencies.
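As a worked illustration of the direct transform in equation (15), consider a single rectangular pulse of unit height and half-width a. The pulse and the symbol a are assumptions chosen for this example; they do not come from the respiratory sound data.

```latex
f(t) =
\begin{cases}
1, & |t| \le a \\
0, & |t| > a
\end{cases}
\qquad\Longrightarrow\qquad
F(j\omega) = \int_{-a}^{a} e^{-j\omega t}\, dt
           = \frac{e^{j\omega a} - e^{-j\omega a}}{j\omega}
           = \frac{2\sin(\omega a)}{\omega}
```

The spectrum of this nonperiodic pulse is continuous in $\omega$, in contrast to the discrete harmonics of the Fourier series.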

Figure 3 illustrates the steps discussed in the following section. The
signal was conditioned by the use of the analog filters previously
discussed in the experimental procedures. The 100-Hz high-pass filter and
the 1800-Hz low-pass filter were thus acting together as a band-pass
filter. The respiratory sound data and flow data were digitized, separated
using the separation program, and stored on diskettes for the FFT analysis.
The respiratory sound signal was digitized at a rate of 4096
samples/second. The sampling rate was chosen to avoid "aliasing" of the
spectra. The literature indicates that the highest frequency of normal
respiratory sound is about 1500 Hz, and this may be higher for abnormal
subjects. The Nyquist sampling criterion requires that the sampling
frequency be at least twice the highest frequency of interest of the signal
being sampled; at 4096 samples/second, frequencies up to 2048 Hz are
represented without aliasing, which is above the 1800-Hz cutoff of the
low-pass filter. The first 1024 samples, which correspond to 0.25 s of real
time data, were transformed by the Cooley-Tukey Fast Fourier Transform
(FFT) algorithm.
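The sampling arithmetic and the transform step can be sketched as follows. This is a minimal recursive radix-2 Cooley-Tukey FFT written for illustration only; it is not the program used in the study, and the 400-Hz test tone and all names are assumptions of this example.

```python
import cmath
import math

SAMPLE_RATE_HZ = 4096          # digitizing rate stated in the text
N_SAMPLES = 1024               # samples transformed per segment

window_seconds = N_SAMPLES / SAMPLE_RATE_HZ   # duration of one segment
resolution_hz = SAMPLE_RATE_HZ / N_SAMPLES    # spacing between spectral lines
nyquist_hz = SAMPLE_RATE_HZ / 2               # highest unaliased frequency


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of 2")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


# Hypothetical test signal: a pure 400-Hz tone sampled for one segment.
tone = [math.sin(2 * math.pi * 400 * i / SAMPLE_RATE_HZ) for i in range(N_SAMPLES)]
spectrum = fft(tone)

# Keep only the non-redundant half of the spectrum (0 Hz up to the Nyquist frequency).
power = [abs(c) ** 2 for c in spectrum[: N_SAMPLES // 2]]
peak_bin = max(range(len(power)), key=lambda i: power[i])
peak_hz = peak_bin * resolution_hz
```

Because 400 Hz is an exact multiple of the line spacing, the tone falls on a single spectral line; tones between lines would spread into neighboring lines.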

When only a finite segment of the signal is observed, the process is
equivalent to multiplying the signal by a rectangular window function. In
the frequency domain, this multiplication becomes a convolution between the
desired spectrum and that of the window. As a result, the frequency
spectrum is distorted, and the spectral components "leak" away from their
true frequencies and are distributed over the total spectrum [34]. A
rectangular window function is not accurate in describing the signal and
therefore produces signal discontinuities at the boundaries. The Fourier
transform will add all harmonics to simulate the fast rising edge of the
window, and this is inaccurate in representing the signal. A suitable
window function needs to be chosen to minimize the "spectral leakage." The
window chosen was a 10% cosine tapered window. This window forces the
first data point to zero, and the rising edge of the window is a cosine
function. The equations for the cosine taper are:


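$$w(I) = 0.5\,\bigl(1 - \cos(10\pi I/N)\bigr) \quad \text{for } 0 \le I \le N/10 \tag{17}$$

$$w(I) = 1 \quad \text{for } N/10 < I < 9N/10 \tag{18}$$

and

$$w(I) = 0.5\,\bigl(1 - \cos(10\pi I/N)\bigr) \quad \text{for } 9N/10 \le I \le N \tag{19}$$

where w(I) is the window value, π = 3.1415926, I is the index, and N is the
number of samples being used for the FFT calculation (in this case 1024).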
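The taper in equations (17) through (19) can be sketched in a few lines. The function names are hypothetical, and applying the weights to indices 0 through N-1 of a segment is an assumption of this example.

```python
import math


def cosine_taper_weight(i, n):
    """Weight of the 10% cosine taper at index i for a segment of n samples."""
    if i <= n / 10 or i >= 9 * n / 10:
        # Rising edge (first 10%) and falling edge (last 10%) share one formula.
        return 0.5 * (1.0 - math.cos(10.0 * math.pi * i / n))
    return 1.0  # flat middle 80% of the window


def apply_cosine_taper(samples):
    """Multiply each sample by its taper weight before the FFT."""
    n = len(samples)
    return [s * cosine_taper_weight(i, n) for i, s in enumerate(samples)]


# A constant segment makes the window shape itself visible.
tapered = apply_cosine_taper([1.0] * 1024)
```

The first weight is exactly zero and the middle of the segment is unchanged, so the discontinuity at the segment boundaries is removed at the cost of attenuating the first and last tenth of the data.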

Calculation  of  Parameters 


Three indices of measure were calculated from the power spectrum.
These parameters were the mean frequency of the power spectrum (MPF), the
frequency of the maximum power (FPK), and the highest frequency at which
the power in the spectrum equals or is less than 10% of the maximum power
(FMAX). The rationale behind selection of these parameters is as follows:

(1)  In  descriptive  statistics,  the  mean  (MPF)  corresponds  to  the 
central  tendency  of  the  distribution. 


(2)  The  peak  value  (FPK)  or  the  mode  of  a  distribution  corresponds  to 
the  most  frequent  occurrence  of  an  event,  in  this  case  the  power 
within  the  respiratory  sound  spectrum. 


(3) The highest frequency (FMAX) at which the power in the spectrum
becomes 10% of the peak power corresponds to a rough bandwidth from
DC to the frequency at which the power contents remain 10 dB below
maximum power in the spectrum. This parameter may also be used as an
indicator of the correctness in the selection of the Nyquist sampling
frequency. Also, Charbonneau et al. [6] used the FMAX parameter as
part of a linear function to discriminate normal from asthmatic
patients.

The mean frequency of the power spectra was calculated from the
equation:

$$\mathrm{MPF} = \frac{\sum_i C_i^2\, F_i}{\sum_i C_i^2} \tag{20}$$

where i is the index 0, 1, 2, ..., N-1, $F_i$ is the frequency at index i,
and $C_i$ is the Fourier coefficient at index i. The coefficient $C_i$ is
computed from the FFT:

$$C_i = (a_i^2 + b_i^2)^{1/2} \tag{21}$$

where $a_i$ is the real component of the Fourier coefficient at index i, and
$b_i$ is the imaginary component of the Fourier coefficient at index i.
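Equations (20) and (21) translate directly into a short routine. The function name and the small numerical example are assumptions made for illustration; the caller decides which spectral lines and frequencies to pass in.

```python
def mean_power_frequency(real, imag, freqs):
    """Power-weighted mean frequency, following equations (20) and (21).

    real[i], imag[i] are the components a_i, b_i of the Fourier coefficient
    at index i, and freqs[i] is the frequency F_i of that spectral line.
    """
    power = [a * a + b * b for a, b in zip(real, imag)]   # C_i squared
    total = sum(power)
    if total == 0:
        raise ValueError("spectrum has no power")
    return sum(p * f for p, f in zip(power, freqs)) / total


# Made-up four-line spectrum with 4-Hz spacing.
mpf = mean_power_frequency(
    real=[0.0, 3.0, 0.0, 1.0],
    imag=[0.0, 4.0, 0.0, 0.0],
    freqs=[0.0, 4.0, 8.0, 12.0],
)
```

Here the line at 4 Hz carries power 25 and the line at 12 Hz carries power 1, so the mean is pulled only slightly above 4 Hz.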

The next parameter calculated by the program is the frequency of the
maximum power of the power spectrum. A sorting routine is used to locate
the maximum power and the index at which the maximum power occurred. The
peak frequency then becomes

$$\mathrm{FP} = I \times \mathrm{FR} \tag{22}$$

where I is the index 0, 1, 2, ..., N-1, and FR is the frequency resolution. The
frequency resolution (FR) is calculated from the period or window length in
time, that is:

$$\mathrm{FR} = \frac{1}{0.25\ \mathrm{s}} = 4\ \mathrm{Hz} \tag{23}$$

The final parameter, the highest frequency at 10% of the maximum power
(FMAX), is calculated in a manner similar to the calculation of the
frequency of the maximum power. Again, a sorting routine is used to find
the index at which the power becomes 10% of the maximum power. Then the
frequency is calculated from the index.
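The two index-based parameters can be sketched as follows. A linear scan is used here in place of the sorting routine mentioned in the text, and FMAX is interpreted as the highest spectral line whose power is still at least 10% of the maximum; both choices, along with the function names and the sample spectrum, are assumptions of this example.

```python
def peak_frequency(power, resolution_hz=4.0):
    """Frequency of the maximum power: index of the largest line times FR."""
    peak_index = max(range(len(power)), key=lambda i: power[i])
    return peak_index * resolution_hz


def fmax_frequency(power, resolution_hz=4.0, fraction=0.10):
    """Highest frequency whose power is at least `fraction` of the maximum.

    Scans downward from the top of the spectrum and stops at the first
    line that reaches the threshold.
    """
    threshold = fraction * max(power)
    for i in range(len(power) - 1, -1, -1):
        if power[i] >= threshold:
            return i * resolution_hz
    raise ValueError("no spectral line reaches the threshold")


# Made-up power spectrum with 4-Hz line spacing.
example_power = [0.0, 2.0, 10.0, 6.0, 1.5, 0.5, 0.2]
fpk = peak_frequency(example_power)
fmax = fmax_frequency(example_power)
```

In this made-up spectrum the peak is at index 2, and index 4 is the last line at or above one tenth of the peak power.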

As previously mentioned, an FFT was performed on the respiratory sound
data and power spectra plots were obtained. Several plots of the power
spectra for both expiration and inspiration have been included for
illustration. As shown on the following figures, the percentage of maximum
power was plotted on the vertical axis and the frequency was plotted on the
horizontal axis. The plots show the power spectra analysis of each 0.25 s
for the entire length of time specified during data collection. The time
interval varied between 1.5 s and 2.5 s.


Inspiratory  and  Expiratory  Power  Spectra  Plots 

Figure 4 illustrates a plot of the power spectra analysis of the
inspiration from one subject. This plot appears to indicate a bimodal
pattern of the power spectra. The breath sounds for this specific time
interval are depicted in the upper right-hand corner. Figure 5 illustrates
a plot of both the respiratory sound data and the inspiratory flow rate. A
plot of this type was done for every inspiratory and expiratory respiratory
sound analyzed. The purpose was to determine the 0.25 s intervals of the
power spectra that correspond to either inspiration or expiration. Since
Figure 4 was the power spectra analysis of an inspiratory breath, Figure 5
was used to determine which 0.25 s intervals corresponded to the
inspiratory respiratory sounds. Intervals 3 through 7 corresponded to the
inspiratory respiratory sounds; therefore, the indices of the power spectra
for these 5 intervals were tabulated for use in the cluster analysis of the
data. This procedure was followed for every respiratory sound (expiration,
inspiration) that was analyzed to ensure that the indices of the power
spectra analysis for expiration and inspiration were kept separate for the
cluster analysis of the data.

A  plot  of  expiratory  sound  power  spectra  is  included  for  illustration. 
Figure  6  illustrates  a  possible  bimodal  pattern  of  the  power  spectra  for 
the  expiration  cycle  of  this  particular  subject.  Figure  7  illustrates  the 
expiratory  cycle  and  the  intervals  chosen  for  the  cluster  analysis.  We 
noted  that  the  expiratory  respiratory  sounds  appear  to  be  more  forceful  at 
the  start  of  the  cycle  and  then  taper  off.  These  plots  illustrate  a 
variety  in  the  power  spectra  which  may  indicate  a  difference  of  breathing 
patterns  and/or  a  difference  in  subjects.  The  bimodal  patterns  of  the 
power  spectra,  illustrated  in  some  of  the  figures,  indicate  that  the  type  of 
analysis chosen may need modifications to handle this type of power spectra
arrangement.


CLUSTER  ANALYSIS 


The  previously  mentioned  quantitative  analysis  of  the  data  resulted  in 
three  parameters  of  the  power  spectra  for  each  0.25  s  observation.  These 
parameters  are  MPF,  FPK,  and  FMAX.  Expiratory  and  inspiratory  data  can  be 
found  in  Appendix  E.  Since  the  study  is  a  blind  study,  the  best  way  to 
approach  the  test  of  the  hypothesis  is  by  cluster  analysis. 

Cluster  analysis  is  used  when  the  classification  of  the  observations  is 
not  known.  The  objective  of  cluster  analysis  is  to  determine  whether  or 
not  the  data  fall  into  separate  clusters  or  groups.  The  very  concept  of 
"cluster"  is  a  subjective  matter.  One  ordinarily  thinks  of  a  cluster  as  a 
set  of  objects  which  are  all  close  together.  Examples  may  include  a 
sunburst  or  a  group  of  people  at  a  party.  A  set  of  objects  arranged  along 
a  straight  line  would  not  be  described  as  a  cluster  in  the  ordinary 
connotation  of  the  word. 


Statistical Analysis System (SAS) is a computer software system for data
analysis. A SAS data set contains not only the data values, but also such
information as variable names, labels, and formats.


The procedure used for cluster analysis was the FASTCLUS procedure.
FASTCLUS performs a cluster analysis on the basis of Euclidean distances
computed from one or more quantitative variables. The analysis is a
disjoint cluster analysis, meaning that every observation belongs to one
and only one cluster. FASTCLUS is intended for use with large data sets
(about 100 to 100,000 observations), and uses a method that is sometimes
referred to as nearest centroid sorting. FASTCLUS is an iterative
algorithm for minimizing the sum of squared distances from the cluster
means. Clusters are formed such that all the Euclidean distances between
observations in the same cluster are less than all Euclidean distances
between observations in different clusters. The Euclidean distance between
two points $P_1(x_1, y_1)$ and $P_2(x_2, y_2)$ can be written as:

$$d^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 \tag{24}$$

where x and y are the coordinates of the points.

FASTCLUS  always  selects  the  first  complete  (no  missing  values) 
observation  as  the  first  seed.  A  seed  refers  to  the  mean  of  a  cluster. 
The  number  of  seeds  selected  corresponds  to  the  number  of  clusters 
specified  in  the  procedure  statement.  The  next  complete  observation  that 
is  separated  from  the  first  seed  by  at  least  the  RADIUS  becomes  the  second 
seed, and so forth until the desired number of seeds is chosen. The
default value of zero was used for RADIUS. RADIUS establishes the minimum
distance  criterion  for  selecting  new  seeds.  Two  tests  are  made  to  see  if 
an  observation  can  qualify  as  a  new  seed.  First,  an  old  seed  is  replaced 
if  the  distance  between  the  two  closest  seeds  is  less  than  the  distance 
from  the  observation  to  the  nearest  seed.  The  seed  that  is  chosen  to  be 
replaced  is  selected  from  the  two  seeds  that  are  closest  to  each  other,  and 
it  is  the  seed  that  is  also  closest  to  the  observation.  If  this  test 
fails,  a  second  test  is  performed.  The  observation  will  replace  the 
nearest  seed  if  the  smallest  distance  from  the  observation  to  all  seeds 
other  than  the  nearest  one  is  greater  than  the  shortest  distance  from  the 
nearest seed to all other seeds. If this test also fails, FASTCLUS goes on
to  the  next  observation.  Each  observation  is  assigned  to  the  nearest  seed 
to  form  temporary  clusters.  The  seeds  are  then  replaced  by  the  means  of 
the  temporary  clusters  and  the  process  is  repeated  until  no  further  changes 
occur  in  the  clusters. 
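The seed selection and iteration described above can be sketched as follows. This is a simplified reading of the description in the text, not the SAS implementation: the function names, the handling of ties, the treatment of RADIUS as a greater-than-or-equal test, and the nine sample points are all assumptions of this example.

```python
import math


def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_index(point, seeds):
    """Index of the seed closest to the given point."""
    return min(range(len(seeds)), key=lambda i: distance(point, seeds[i]))


def select_seeds(observations, k, radius=0.0):
    """Pick k initial seeds using the two replacement tests described in the text."""
    seeds = []
    for obs in observations:
        if len(seeds) < k:
            # Accept as a new seed if far enough from every existing seed.
            if all(distance(obs, s) >= radius for s in seeds):
                seeds.append(obs)
            continue
        if k < 2:
            continue  # the replacement tests need at least two seeds
        # Test 1: are the two closest seeds nearer to each other than the
        # observation is to its nearest seed?
        d_min, i, j = min(
            (distance(seeds[a], seeds[b]), a, b)
            for a in range(k) for b in range(a + 1, k)
        )
        near = nearest_index(obs, seeds)
        if d_min < distance(obs, seeds[near]):
            # Replace whichever of the two closest seeds is nearer the observation.
            victim = i if distance(obs, seeds[i]) <= distance(obs, seeds[j]) else j
            seeds[victim] = obs
            continue
        # Test 2: compare distances that exclude the nearest seed.
        others = [m for m in range(k) if m != near]
        d_obs_others = min(distance(obs, seeds[m]) for m in others)
        d_near_others = min(distance(seeds[near], seeds[m]) for m in others)
        if d_obs_others > d_near_others:
            seeds[near] = obs
    return seeds


def nearest_centroid_sort(observations, k, max_iter=100):
    """Assign observations to the nearest seed, then replace seeds by cluster means."""
    seeds = select_seeds(observations, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest_index(o, seeds) for o in observations]
        if new_labels == labels:
            break  # no further changes occur in the clusters
        labels = new_labels
        for c in range(len(seeds)):
            members = [o for o, lab in zip(observations, labels) if lab == c]
            if members:
                seeds[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, seeds


# Made-up two-variable observations forming three well-separated groups.
points = [
    (200, 300), (210, 320), (190, 310),
    (500, 800), (510, 820), (490, 790),
    (900, 1400), (910, 1380), (890, 1420),
]
labels, means = nearest_centroid_sort(points, k=3)
```

Although the first three observations all lie in the same group and become the initial seeds, the replacement tests move two of those seeds into the other groups before the iteration starts, which is the purpose of the tests.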


RESULTS  AND  DISCUSSION 

The  SAS  FASTCLUS  program  was  used  to  obtain  clusters  of  10,  5,  4,  3, 
and  2  groups  from  the  respiratory  spectral  data.  In  all,  14  cases  with  5 
selections  of  groupings  resulted  in  70  cases-groups  being  analyzed.  Cases 
1  through  6  are  two-dimensional  plots  of  paired  variables  (i.e.,  MPF  vs 
FLOW,  FPK  vs  FLOW,  etc.)  during  expiration.  Table  1  summarizes  the  cases 
for both inspiratory and expiratory data. Cases 8 through 13 are the
two-variable cluster plots during inspiration. Cases 7 and 14 present
three-dimensional clusters.
Table 1. SUMMARY OF CLUSTER ANALYSES OF INSPIRATORY AND EXPIRATORY DATA

Mode          Case   Cluster variables   Max. cluster
Expiration      1    MPF*FLOW            10,5,4,3,2
Expiration      2    FPK*FLOW            10,5,4,3,2
Expiration      3    FMAX*FLOW           10,5,4,3,2
Expiration      4    MPF*FPK             10,5,4,3,2
Expiration      5    MPF*FMAX            10,5,4,3,2
Expiration      6    FPK*FMAX            10,5,4,3,2
Expiration      7    MPF*FPK*FMAX        10,5,4,3,2
Inspiration     8    MPF*FLOW            10,5,4,3,2
Inspiration     9    FPK*FLOW            10,5,4,3,2
Inspiration    10    FMAX*FLOW           10,5,4,3,2
Inspiration    11    MPF*FPK             10,5,4,3,2
Inspiration    12    MPF*FMAX            10,5,4,3,2
Inspiration    13    FPK*FMAX            10,5,4,3,2
Inspiration    14    MPF*FPK*FLOW        10,5,4,3,2

FLOW is respiratory flow-rate in liters per second.
Expiratory Cluster Analysis


Figures  8  and  9  illustrate  cluster  attempts  for  Case  1  with  10  groups 
and  5  groups  respectively.  The  numbers  on  the  figure  represent  the  cluster 
group to which the data point is assigned. From inspection, no
distinguishable clusters appear in Figures 8 or 9. Figure 10 illustrates
the  clustering  of  Case  1  data  into  4  groups.  Boundaries  between  the  groups 
appear  at  values  of  the  mean  frequency  at  260,  480,  and  750  Hz.  Figure  11 
represents  clustering  of  Case  1  data  into  3  groups.  The  boundaries  appear 
at  values  of  the  mean  frequency  at  310  and  640  Hz.  The  boundary  between 
groups  2  and  3  is  not  well  defined,  with  some  data  points  overlapping  in 
the  2  cluster  groups.  Figure  12  shows  clustering  of  Case  1  data  into  2 
groups.  The  rationale  for  the  2  groups  is  to  have  1  cluster  group 
represent  normal  subjects  and  the  other  group  represent  abnormal  subjects 
with  pulmonary  insufficiency.  The  boundary  between  the  2  groups  appears  to 
be  at  a  mean  frequency  of  510  Hz.  Group  2  is  not  a  clearly  defined  cluster 
group.  From  Figure  12,  it  appears  that  2  groups  are  not  enough  to  cluster 
the  expiratory  data  by  the  variables  MPF  and  FLOW. 

Clustering the data by the variables MPF and FLOW does not appear to
separate the data into distinguishable clusters. The 3 clusters in Figure
11 appear to have a sharper boundary drawn between clusters than the other
combinations tried; however, they do not appear to be 3 distinct clusters.

In all the two-dimensional cluster cases, clustering into 10 or 5
groups results in 3 distinct cluster groupings. This result indicates that
more than 4 groups is too many. Likewise, clustering into 2 groups
indicates that 2 groups are not enough.
Therefore, in the discussion that follows only clustering of the data into
4 and 3 groups will be presented.

Figure  13  presents  clustering  into  4  groups  for  Case  2.  The  figure 
seems  to  indicate  3  separate  groups.  The  data  points  in  groups  1  and  2 
appear to belong to a single group. Figure 14 illustrates clustering of
Case 2 data into 3 groups. The boundaries between groups appear at mean
frequency values of 820 and 380 Hz. In summary, separating the data into
clusters by the variables FPK and FLOW (Case 2) appears to show 3 separate
clusters.

Case  3  is  an  attempt  to  cluster  the  data  by  the  variables  FMAX  and 
FLOW.  From  this  point  on,  plots  for  10  and  5  groups  may  be  found  in 
Appendixes  A  and  B,  respectively,  since  in  all  cases  these  clusters  seem  to 
intermingle,  indicating  that  the  number  of  clusters  chosen  is  too  high. 
Likewise,  cluster  plots  for  2  groups  may  be  found  in  Appendix  C,  since  in 
all  cases  there  appear  to  be  more  than  2  groups.  Thus,  in  the  interest  of 
brevity,  only  cluster  plots  of  4  and  3  groups  will  be  presented  and 
discussed.  Figure  15  illustrates  the  clustering  of  4  groups.  Two  cluster 
groups are above an FMAX value of 900. The other two boundaries may be
drawn  at  FMAX  values  of  425  and  1225.  Figure  16  is  a  plot  of  3  group 
clusters.  Thus,  separating  the  data  by  variables  FMAX  and  FLOW  appears  to 
cluster the data into 3 groups.

Figure 8. Cluster plot of Case 1, expiration, with 10 cluster
groups (Maxc=10).

Figure 9. Cluster plot of Case 1, expiration, with 5 cluster
groups (Maxc=5).

Figure 11. Cluster plot of Case 1, expiration, with 3 cluster
groups (Maxc=3).

Figure 13. Cluster plot of Case 2, expiration, with 4 cluster
groups (Maxc=4).

Figure 15. Cluster plot of Case 3, expiration, with 4 cluster
groups (Maxc=4).

Figure 16. Cluster plot of Case 3, expiration, with 3 cluster
groups (Maxc=3).

Figure 17. Cluster plot of Case 4, expiration, with 4 cluster
groups (Maxc=4).

In Case 4 the data are clustered by the variables MPF and FPK. Figure 17
is a plot of 4 clusters for the variables MPF and FPK. This figure appears
to indicate 4 separate cluster groups. Clusters 2 and 3 appear to be dense
and distinctly separate from each other. Cluster 4 is separate from the
other 3 clusters, with a very high value for MPF and FPK. Cluster 1 appears
to be dense with respect to FPK; however, it is not as dense a cluster as
clusters 2 and 3. The boundaries for the 4 separate clusters are as follows:


Cluster   MPF           FPK
1         440 to 860    88 to 413
2         100 to 420    50 to 338
3         200 to 600    325 to 775
Figure 18 is a plot of MPF vs. FPK for 3 cluster groups. Cluster 1 appears
to be at least 2 separate clusters. Thus, separating the data by the
variables MPF vs. FPK appears to cluster the data into 4 groups.

Case 5 clusters the data by the variables MPF and FMAX. Figure 19 is a
plot of MPF vs. FMAX for a maximum of 4 clusters. This plot shows 4
clusters with the following boundaries, with some overlapping data points
near the boundaries.

Cluster   MPF           FMAX
1         640 to 980    1050 to 1500
2         100 to 580    100 to 413
3         100 to 560    388 to 838
4         320 to 640    950 to 1475

Figure 20 is a cluster plot of MPF vs. FMAX for clusters of 3 groups.
From this plot it appears as if the data are clustered into 3 groups along
the FMAX scale. Thus, separating the data by both variables MPF and FMAX
appears to cluster the data into 4 groups.

In  Case  6  the  data  is  clustered  by  variables  FPK  and  FMAX.  Figure  21 
is  a  plot  of  FPK  vs.  FMAX  for  a  maximum  cluster  of  4.  This  plot  shows  4 
cluster  groups  with  boundaries  as  follows: 
Figure 22 shows the cluster plot of FPK vs. FMAX for 3 cluster groups.
This plot appears to cluster the data along the FMAX scale with the
boundary at FMAX equal to a value of 900. Cluster group 2 in Figure 22
appears to be 2 cluster groups, instead of 1 as shown. Thus, separating
the data by both variables FPK and FMAX appears to cluster the data into 4
groups. As noted, the previous 6 cases studied are not consistent in the
number of cluster groupings which are apparent. Some combinations of
parameters appear to indicate 3 cluster groups while other combinations
appear to show 4 clusters. Some combinations do not reveal any distinct
clusters at all.

Figure 19. Cluster plot of Case 5, expiration, with 4 cluster
groups (Maxc=4).

Figure 20. Cluster plot of Case 5, expiration, with 3 cluster
groups (Maxc=3).

Figure 21. Cluster plot of Case 6, expiration, with 4 cluster
groups (Maxc=4).

Figure 22. Cluster plot of Case 6, expiration, with 3 cluster
groups (Maxc=3).

Case  7  is  an  attempt  to  give  a  clearer  indication  of  the  relationship 
among  the  3  indices  of  the  power  spectra  (MPF,  FPK  and  FMAX)  by  a  three- 
dimensional  cluster  plot. 

An  IBM  XT  microcomputer  and  graphics  printer  were  used  to  construct 
three-dimensional  plots  of  the  USAF  data  and  data  from  a  previous  study  by 
Wong  [36].  Wong’s  study  was  a  power  spectral  analysis  of  respiratory 
sounds  at  the  trachea  of  normal  young  men.  Since  USAF  data  contained 
normal  subjects  and  pulmonary  insufficiency  patients,  Wong's  population  of 
subjects  was  used  as  a  comparison. 

A  scale  was  chosen  according  to  the  maximum  value  of  the  3  parameters 
of  the  power  spectra  of  the  USAF  data.  This  scale  was  used  for  both  the 
USAF  data  plots  and  for  the  normal  subjects  of  Wong's  study.  A  grid 
pattern  was  set  up  and  the  data  was  entered  at  the  intersection  of  the  grid 
lines. The plots are an approximation of a three-dimensional view. Three
different angles were used to obtain a clear picture of the relationship
between the parameters:
A boundary was drawn around the points in the three-dimensional plot of
normal  subjects.  The  individual  plots  of  normal  subjects  and  USAF  data  for 
the  angles  noted  before  are  found  in  Appendix  D.  A  plot  of  the  normal 
subjects  superimposed  onto  the  plot  of  the  USAF  data  is  included  for 
discussion.  The  normal  subjects  of  Wong's  study  are  surrounded  by  the 
boundary.  In  Figure  23,  the  normal  subjects  appear  to  be  at  an  angle  from 
the  vertical,  while  other  data  points  appear  to  follow  more  of  a  vertical 
line.  Figure  24  does  not  appear  to  show  any  distinction  among  the  data 
points.  Figure  25  appears  to  show  3  groups.  The  normals  are  centrally 
clustered  and  there  are  data  points  clustered  both  to  the  right  of  the 
normals  and  some  to  the  lower  left  of  the  cluster. 

Figures 23 and 25 suggest a possible clustering of the
data into separate groups. These plots indicate that just 1 or
2 parameters are probably not enough to separate the data distinctly
into groups.
Inspiratory Cluster Analysis


The remaining part of the discussion deals with cases 8 through 14, which
are the combinations of variables used to cluster the inspiratory data into
various numbers of groups. The numbers of cluster groups used in the program
are 10, 5, 4, 3, and 2. As previously discussed with the expiratory data,
10 and 5 groups appear to be too many for clustering the data. The
plots either reveal no distinguishable cluster groups or there are only a
few members in a cluster group. Therefore, 10 and 5 are not the optimum
numbers of cluster groups for the data being
analyzed. Plots of the 10 and 5 cluster groups for cases 8 through 14
are found in Appendixes A and B, respectively. Discussion of these results
has been omitted in the interest of brevity.

Case  8  is  a  plot  of  MPF  vs.  FLOW.  Figure  26  illustrates  clustering 
into  4  cluster  groups.  The  boundaries  between  the  clusters  are  not 
distinct  and  the  plot  does  not  show  separate  distinct  groupings.  For 
example,  cluster  2  is  a  very  scattered  group  with  some  of  the  2's  found 
very close to the 4th cluster. Cluster 3 is relatively dense with a mean
flow value less than 1.8 l/s; however, the plot shows the cluster also
including  data  with  very  high  mean  flow  values.  The  boundaries  appear  to 
be  oriented  horizontally  with  the  main  emphasis  on  the  MPF  parameter.  For 
this  case,  clusters  of  the  data  are  not  well  defined. 

Figure  27  illustrates  the  attempt  to  make  3  clusters.  The  boundaries 
in  this  plot  also  appear  to  be  oriented  horizontally  with  the  main  emphasis 
on  the  MPF  parameter.  The  boundaries,  however,  have  intermingling  data 
points  and  the  plot  does  not  show  3  distinct  clusters. 

Figure  28  illustrates  clustering  of  the  inspiratory  data  into  2  groups. 
The boundary is drawn at an MPF of 550, with anything above the value being
defined as cluster 2 and anything below the value being defined as cluster
1. The boundary is not well defined with intermingling of data points, and
the clusters are not 2 clearly separate groups. Clustering the data by the
variables MPF and FLOW does not appear to separate the data into
distinguishable  clusters.  This  agrees  with  the  result  of  using  the 
variables  MPF  and  FLOW  to  cluster  the  expiratory  data. 

Case  9  clusters  the  data  by  the  variables  FPK  and  FLOW.  Figure  29 
illustrates  clustering  of  the  data  into  4  groups.  The  data  appear  to 
separate  into  3  groups  instead  of  4.  Clusters  2  and  3  appear  to  belong 
together  and  be  separate  from  the  other  2  clusters.  Figure  30  is  a  plot  of 
FPK vs. FLOW for 3 cluster groups. This plot appears to show better
clustering than the plot of 4 clusters. The boundaries drawn between the 3
clusters are at FPK values of 350 and 950. In Figure 31, which shows
clustering into 2 groups, it appears as if there are 3 cluster groups, with
cluster group 1 actually being 2 separate groups. Clustering the data by
the variables FPK
and  FLOW  appears  to  separate  the  data  into  3  separate  clusters.  This 
grouping  agrees  with  the  number  of  clusters  found  for  FPK  and  FLOW  in 
expiratory  data. 
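
As a minimal sketch of how the two FPK boundaries partition the data, the following assigns each FPK value to one of 3 bands using the boundary values of 350 and 950 quoted above. The group numbers here simply increase with FPK, which is an assumption. They need not match the cluster numbers used in Figure 30, and the sample values are hypothetical.

```python
import numpy as np

FPK_BOUNDARIES = [350.0, 950.0]  # boundary values quoted in the text

def assign_fpk_group(fpk, boundaries=FPK_BOUNDARIES):
    """Return group numbers 1..3 according to the FPK band of each value."""
    fpk = np.asarray(fpk, dtype=float)
    # np.digitize gives 0, 1, or 2 for the three bands; add 1 for 1-based groups.
    return np.digitize(fpk, boundaries) + 1

# Hypothetical FPK values for illustration only.
groups = assign_fpk_group([120.0, 349.0, 350.0, 700.0, 1200.0])
print(groups.tolist())
```

The same function applies to any other pair of boundary values by passing a different `boundaries` argument.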


Figure 26. Cluster plot of Case 8, inspiration, with 4 cluster
groups  (Maxc=4). 


Figure  28.  Cluster  plot  of  Case  8,  inspiration,  with  2  cluster 
groups (Maxc=2).


Figure 29. Cluster plot of Case 9, inspiration, with 4 cluster
groups  (Maxc=4). 


Figure  30.  Cluster  plot  of  Case  9,  inspiration,  with  3  cluster 
groups  (Maxc=3). 


Figure  31.  Cluster  plot  of  Case  9,  inspiration,  with  2  cluster 
groups  (Maxc=2). 


Case  10  clusters  the  data  by  the  variables  FMAX  and  FLOW.  Figure  32 
shows  a  plot  of  FMAX  vs.  FLOW  for  a  maximum  of  4  cluster  groups.  This  plot 
appears  to  indicate  3  groups  instead  of  4.  Cluster  groups  3  and  4  do  not 
appear to be distinctly separate groups. Figure 33 shows a plot of FMAX
vs.  FLOW  for  a  maximum  of  3  clusters.  The  plot  appears  to  have  3  separate 
groups.  The  boundaries  drawn  appear  to  be  at  FMAX  values  of  425  and  950. 
The  groups  appear  to  be  separate  from  each  other;  however,  cluster  2  is  not 
a  very  dense  group  with  some  points  scattered  close  to  cluster  3.  Figure 
34 is a plot of 2 cluster groups and, as in previous cases, cluster group 1
appears to separate into 2 groups. For subsequent cases, the results of
clustering into 2 groups may be found in Appendix C. In summary,
clustering  the  data  by  FMAX  and  FLOW  appears  to  cluster  the  data  into  3 
groups.  This  result  agrees  with  the  expiratory  data  for  the  same  case 
(FMAX, FLOW)  which  appear  to  cluster  the  data  into  3  groups. 
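
The procedure of repeating the clustering with a maximum of 4, 3, and 2 groups can be sketched as follows. This section does not name the clustering algorithm, so plain k-means (Lloyd's algorithm) is used here as a stand-in. The function name, the `maxc` argument, and the synthetic FMAX and FLOW values are all hypothetical. The variables are not standardized in this sketch, so the variable with the larger numeric range dominates the distances. Whether the original software standardized the variables is not stated.

```python
import numpy as np

def kmeans(points, maxc, n_iter=100, seed=0):
    """Plain Lloyd's k-means (a stand-in, not the report's stated method).

    Returns (labels, centers, within-cluster sum of squares).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(points, dtype=float)
    # Start from maxc distinct data points chosen at random.
    centers = x[rng.choice(len(x), size=maxc, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            x[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(maxc)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    wss = float(((x - centers[labels]) ** 2).sum())
    return labels, centers, wss

# Synthetic data for illustration only: three FMAX levels with arbitrary FLOW.
rng = np.random.default_rng(1)
fmax = np.concatenate([rng.normal(m, 40.0, 20) for m in (300.0, 700.0, 1100.0)])
flow = rng.normal(1.0, 0.2, fmax.size)
data = np.column_stack([fmax, flow])

results = {}
for maxc in (4, 3, 2):
    labels, centers, wss = kmeans(data, maxc)
    results[maxc] = (labels, wss)
    print(maxc, np.bincount(labels, minlength=maxc), round(wss, 1))
```

Comparing the group sizes and the within-cluster sum of squares across the 3 runs gives a numerical counterpart to the visual comparison of the plots. It does not replace the judgment applied to the figures.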

The cases discussed next are the results of clustering the data
according to the various combinations of the indices obtained from the
power  spectra.  Cases  11  through  13  correspond  to  the  combinations  of  2 
indices,  and  case  14  corresponds  to  the  combination  of  all  3  indices.  Case 
11  clusters  the  data  by  the  indices  MPF  and  FPK.  Figure  35  is  a  cluster 
plot  of  the  data  into  4  cluster  groups.  The  data  appear  to  be  grouped  into 
4 clusters with boundaries between the clusters as follows:

Figure 36 is the plot resulting from trying to cluster into 3 groups. This
plot agrees with Figure 35 in that the data appear to be grouped into 4
cluster groups as opposed to 3 cluster groups. Cluster 2 appears to be 2
separate groups instead of 1 cluster group. Clustering the data according
to MPF and FPK, therefore, appears to cluster the data into 4 groups.
Cluster 1 (Fig. 35) or cluster 3 (Fig. 36) appears to be the denser cluster
group. In Case 12 the data is clustered by the indices MPF and FMAX.
Figure  37  is  a  plot  of  MPF  and  FMAX  for  4  cluster  groups.  Cluster  1  is  not 
a  dense  cluster  group.  There  appears  to  be  a  zone  between  cluster  groups  1 
and  3  in  which  there  are  data  points  assigned  to  either  cluster  group, 
signifying  that  there  is  not  a  distinct  boundary  between  these  groups. 
Also,  clusters  2  and  4  are  not  well  defined  or  distinguished  from  each 
other.  Figure  38  is  clustering  by  the  variables  MPF  vs.  FMAX  into  3 
groups.  Cluster  group  2  appears  to  be  a  separate  group  from  cluster  groups 
1  and  3.  Cluster  group  1  is  not  as  dense  a  group  as  cluster  group  2.  The 
boundaries between the cluster groups appear to be as follows:


Figure 33. Cluster plot of Case 10, inspiration, with 3 cluster
groups  (Maxc=3). 


Figure  34.  Cluster  plot  of  Case  10,  inspiration,  with  2  cluster 
groups  (Maxc=2). 


Figure  36.  Cluster  plot  of  Case  11,  inspiration,  with  3  cluster 
groups (Maxc=3).


Figure 37. Cluster plot of Case 12, inspiration, with 4 cluster
groups  (Maxc=4). 


Figure  38.  Cluster  plot  of  Case  12,  inspiration,  with  3  cluster 
groups (Maxc=3).

In summary, clustering the data according to the indices MPF and FMAX
appears  to  cluster  the  data  into  3  groups. 

Case  13  clusters  the  data  by  the  indices  FPK  and  FMAX.  Figure  39  is  a 
plot  of  FPK  vs.  FMAX  into  4  groups.  This  plot  does  not  appear  to  show  4 
distinguishable  clusters.  Cluster  groups  1  and  4  appear  to  be  separate 
groups;  however,  cluster  group  1  has  a  dispersion  of  points  in  cluster 
groups  4  and  2.  Cluster  group  2  appears  to  be  a  separate  group,  but 
Cluster  group  3  does  not  show  a  distinct  grouping.  Figure  40  is  a  plot  of 
FPK  vs.  FMAX  into  3  cluster  groups.  This  plot  does  not  clear  up  the 
picture  in  any  way.  The  previous  cluster  groups  1  and  4  are  now  grouped 
into cluster group 3. This grouping does not appear to give an accurate
representation  of  the  clustering  of  the  data. 

In  summary,  clustering  the  data  by  the  indices  FPK  and  FMAX  does  not 
give  a  clear  distinction  of  cluster  groups  in  any  of  the  plots.  This 
grouping  does  not  agree  with  the  expiratory  data  for  the  same  case,  which 
tended  to  cluster  the  data  into  4  groups. 

The  previous  cases  studied  are  not  consistent.  Some  combinations 
appear  to  show  3  clusters;  others  appear  to  show  4  clusters,  and  some  do 
not  reveal  any  distinct  clusters  at  all.  The  next  case,  Case  14,  is  an 
attempt  to  give  a  clearer  indication  of  the  relationship  between  the  three 
indices  of  the  power  spectra  (MPF  vs.  FPK  vs.  FMAX)  by  a  three-dimensional 
cluster  plot. 

The  inspiratory  three-dimensional  plots  of  Case  14  are  analyzed  in  the 
same  fashion  as  the  expiratory  plots  of  Case  7.  The  same  angles  are  used 
for  the  plots  and  a  boundary  is  drawn  around  the  normal  subjects  of  Wong's 
study.  The  individual  plots  of  normal  subjects  and  USAF  data  are  found  in 
Appendix  D. 

In  Figure  41,  we  noted  a  similar  pattern  to  the  expiratory  plot.  The 
normal  subjects  appear  to  veer  out  at  an  angle  and  the  other  data  points 
follow  a  more  vertical  line.  There  is  some  overlapping  of  points  as  in 
Figure  24. 

Figure  42  does  not  appear  to  show  any  distinct  groups.  This  pattern  is 
similar  to  the  pattern  in  Figure  25.  Observing  the  data  from  the  viewing 
angle  of  5°  in  Figure  43  appears  to  give  the  same  conclusion  as  Figure  26. 
The normal data is clustered centrally and there appear to be 2 groups of
data outside the normal boundary. The viewing angles of 85° and 5° appear
to be better angles than 45°. From these 2 angles, clusters appear to be
forming. The viewing angle of 5° in both Case 7 and
Case  14  shows  a  distinct  group  at  a  higher  maximum  frequency  than  the 
normal subjects.


Figure 39. Cluster plot of Case 13, inspiration, with 4 cluster
groups  (Maxc=4). 


CONCLUSIONS 


The  objective  of  this  study  was  to  determine  whether  respiratory  sound 
data  of  normal  volunteers  and  pulmonary  insufficiency  subjects  reveals 
groupings  or  clusters  of  the  data.  Cases  1  through  6  of  the  expiratory 
data,  and  Cases  8  through  13  of  the  inspiratory  data,  dealt  with  1  or  2 
parameters of the power spectra. The plots appear to indicate that the
data may be clustered into 3 or 4 groups, but the number of cluster groups
for the same variables does not necessarily agree between expiration and
inspiration.

The  three-dimensional  plots  appeared  to  give  a  clearer  picture  of  the 
situation.  The  viewing  angles  of  85°  and  5°  with  a  5°  table  angle  appeared 
to  show  3  possible  cluster  groups  from  the  data. 
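
The effect of the viewing angle can be illustrated with a simple orthographic projection. This is a sketch under stated assumptions, not the plotting routine used in the study. It assumes the viewing angle is a rotation about the vertical axis, the table angle is a tilt of the viewpoint, and the points are ordered (MPF, FPK, FMAX) with FMAX vertical. The function name and sample points are hypothetical.

```python
import numpy as np

def project(points, view_deg, table_deg):
    """Orthographic 2-D projection of 3-D points.

    Assumed convention: rotate about the vertical (third) axis by view_deg,
    then tilt the viewpoint by table_deg. Both angles are in degrees.
    """
    p = np.asarray(points, dtype=float)
    a, t = np.radians(view_deg), np.radians(table_deg)
    # Rotate about the vertical axis by the viewing angle.
    xr = p[:, 0] * np.cos(a) - p[:, 1] * np.sin(a)
    yr = p[:, 0] * np.sin(a) + p[:, 1] * np.cos(a)
    # Tilt about the horizontal screen axis by the table angle.
    u = xr
    v = p[:, 2] * np.cos(t) - yr * np.sin(t)
    return np.column_stack([u, v])

# Hypothetical (MPF, FPK, FMAX) points for illustration only.
sample = [[500.0, 300.0, 900.0], [650.0, 450.0, 1400.0]]
views = {angle: project(sample, angle, 5.0) for angle in (85.0, 45.0, 5.0)}
for angle, xy in views.items():
    print(angle, np.round(xy, 1).tolist())
```

Points that overlap at one viewing angle may separate at another, which is consistent with the observation that some angles showed the groups more clearly than others.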


RECOMMENDATIONS 

A  recommendation  is  made  to  perform  a  discriminant  analysis  of  the  data 
to  establish  whether  the  clustering  that  appears  in  the  three-dimensional 
plots  actually  represents  distinct  populations. 
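
One possible form of the recommended discriminant analysis is Fisher's linear discriminant for 2 groups, sketched below. The report does not specify which discriminant method is intended, so this choice is an assumption. The group names, function names, and synthetic (MPF, FPK, FMAX) values are hypothetical and are not data from the study.

```python
import numpy as np

def fisher_discriminant(group_a, group_b):
    """Two-group Fisher linear discriminant.

    Returns the weight vector w and the midpoint threshold on w . x.
    """
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    mean_a, mean_b = a.mean(axis=0), b.mean(axis=0)
    # Pooled within-group scatter matrix.
    sw = (a - mean_a).T @ (a - mean_a) + (b - mean_b).T @ (b - mean_b)
    w = np.linalg.solve(sw, mean_a - mean_b)
    threshold = 0.5 * (w @ mean_a + w @ mean_b)
    return w, threshold

def classify(points, w, threshold):
    """Label each point 'a' or 'b' by the side of the threshold it falls on."""
    scores = np.asarray(points, dtype=float) @ w
    return np.where(scores > threshold, "a", "b")

# Synthetic groups for illustration only.
rng = np.random.default_rng(0)
group_a = rng.normal([700.0, 400.0, 1200.0], 30.0, size=(25, 3))
group_b = rng.normal([450.0, 300.0, 800.0], 30.0, size=(25, 3))
w, threshold = fisher_discriminant(group_a, group_b)
error_rate = float(np.mean(classify(group_a, w, threshold) != "a"))
print(w, threshold, error_rate)
```

A low error rate on data not used to fit the discriminant would support the view that the apparent clusters are distinct populations. A high error rate would suggest that they are not.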

The  power  spectra  plots  revealed  the  appearance  of  a  bimodal 
distribution  that  was  not  evident  from  the  normal  subjects  of  Wong’s  study. 
This may be a very important finding that should be taken into account in
future studies. A recommendation is made to study a better method of
calculating  the  power  spectra  of  such  a  distribution,  and  to  repeat  the 
study  in  the  three-dimensional  realm  and  compare  the  results. 

In the paper by Grassi et al. [19], the ratio of the
inspiratory  and  expiratory  sound  intensity  was  used  as  an  index.  A 
recommendation  is  made  to  repeat  the  study  and  collect  sequential  samples 
of  expiratory  and  inspiratory  respiratory  sounds,  calculate  the  ratio  of 
the  power  spectra  parameters,  and  compare  the  ratio  and  the  combination  of 
the  3  parameters  of  the  power  spectra  as  estimators  for  discrimination 
between normal and pulmonary insufficiency patients.